NONPARAMETRIC ALGORITHMS FOR THE SEARCH OF SIGNALS


T.S. Myphenkova

Academy of Sciences USSR, Institute of Space Research


Introduction 

The purpose of this paper is to develop and justify (insofar
as possible) nonparametric algorithms for the search of signals
and their isolation. In their mathematical formulation, these
problems require certain a priori data about the signal and
noise. A priori data narrows down the search (provided it does
not reflect the properties of all signals) and makes it possible
to isolate only those signals whose characteristics satisfy the
restrictions corresponding to this data.

The search proper assumes that the time, shape, duration,
power, etc. of the signal which appears in the receiving device
are unknown.

Thus, on the one hand, by specifying certain noise and signal
characteristics with the aim of isolating the latter (using
parametric methods based, for example, on threshold
characteristics constructed on the basis of a priori data), we are
not solving the search problem in general; we are solving the
search problem for a completely defined class of signals. On the
other hand, by reducing the volume of a priori data to a certain
level, we can no longer apply the well-developed parametric
methods, which, compared with nonparametric methods, among other
things use up less machine time and computer memory.




1. Test Based on Spearman's Rank Correlation Coefficient

Let us consider a discrete time-dependent stochastic
process, i.e.

$$ F_i = F(t_i), \tag{1} $$

$$ t_{i+1} - t_i = \Delta t = \text{const}. \tag{2} $$

V = 


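Let us introduce random variables characterizing the form
of the process $F$ in a sample of size $n$:

$$ \delta_{ij} = \begin{cases} 1, & F_i \le F_j, \\ 0, & F_i > F_j, \end{cases} \tag{3} $$

$$ \delta_i = \sum_{j=1}^{n} \delta_{ij}. \tag{4} $$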

It is easily seen that $\delta_i$ equals the number of readings
$F$ from a sample of size $n$ which are not smaller than a given
reading $F_i$ from the same sample.
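As an illustration only, these counts can be computed directly from their definition. The following Python sketch assumes that the indicator equals 1 when $F_i \le F_j$, so that every reading is also counted against itself; the function name `delta_counts` and the sample values are hypothetical and are not taken from the author's programs.

```python
def delta_counts(readings):
    """For each reading F_i, count the readings F_j (j = 1..n) with F_j >= F_i.

    Assumption: the indicator is 1 when F_i <= F_j, so F_i counts itself.
    """
    return [sum(1 for fj in readings if fi <= fj) for fi in readings]


# Hypothetical sample of n = 5 readings.
sample = [3, 7, 7, 1, 5]
deltas = delta_counts(sample)

# The largest readings are the ones with the smallest count.
smallest = min(deltas)
largest_positions = [i for i, d in enumerate(deltas) if d == smallest]
```

Under this convention the largest readings in the sample are exactly those with the smallest count, which is how the positions of the largest readings can be located.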


Suppose that $\min_i \delta_i$ is attained at $i = i_m$
($m = 1, \ldots, m_0$; $m_0 < n$), i.e. in the given sample,
the readings $F_{i_m}$ are largest in magnitude.


The quantity $\gamma_i$ defined by expression (5)
characterizes the distance between $F_i$ and $F_{i_m}$.

Thus, two random variables $\delta_i$ and $\gamma_i$ are associated
with each random variable $F_i$. Clearly, if the readings $F_i$,
$i = 1, 2, \ldots, n$, are mutually independent random variables,
the same statement can also be made about the random variables
$\delta_i$ and $\gamma_i$ (each separately).


It is assumed that one characteristic of a signal from a
point source is the presence of periodicity in the readings
arriving in the detector. The question of the period (or periods)
with which the intensity of the arriving signal varies is a
problem in the spectral analysis of a detected signal. Our problem
is the search for signals.

From the presence of periodicities in the variation of the
intensity of radiation from sources, which has been ascertained
in many studies, we infer that the signal has a "non-random",
deterministic form. This fact will be used in constructing
a search algorithm.


The search algorithm is constructed on the basis of the
so-called nonparametric independence test. This test is
convenient in that, unlike most decision problems, it does not
require a knowledge of the type of distribution of the studied
process.

This test is based on the Spearman rank correlation
coefficient.

A rule which satisfies this test is as follows. Suppose
that only the following is known about a random sample of $n$
pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$: $X_k$ occupies
the position $A_k$ in order of decreasing magnitude, and $Y_k$
occupies the position $B_k$ in order of decreasing magnitude
($k = 1, 2, \ldots, n$). If the random variables $X$ and $Y$ are
independent, the statistic

$$ R = 1 - \frac{6}{n(n^2 - 1)} \sum_{k=1}^{n} (A_k - B_k)^2 \tag{6} $$

has an asymptotically normal distribution with mean 0 and variance
$1/(n-1)$ as $n \to \infty$.
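A short worked example may help fix the arithmetic. It assumes the Spearman coefficient in its usual form, $R = 1 - 6 \sum_k (A_k - B_k)^2 / (n(n^2 - 1))$, and uses hypothetical position numbers for $n = 4$ pairs:

```latex
\begin{aligned}
A &= (1, 2, 3, 4), \qquad B = (2, 1, 4, 3), \\
\sum_{k=1}^{4} (A_k - B_k)^2 &= 1 + 1 + 1 + 1 = 4, \\
R &= 1 - \frac{6 \cdot 4}{4\,(4^2 - 1)} = 1 - \frac{24}{60} = 0.6 .
\end{aligned}
```

With so small a sample the normal approximation is not applicable; the example only illustrates how the statistic is formed from the two sets of position numbers.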


In our case, the random variable $\delta_i$ constructed above
represents the position number of the random variable $F_i$ in
order of decreasing magnitude, and the random variable $\gamma_i$,
which characterizes the mutual position of the random variables
$F$, represents the position number of the random variable $F_i$
ordered by its distance from $F_{i_m}$.


This criterion makes it possible to accept or reject the
independence hypothesis at a particular significance level.
Specifically, for any $n > 1$, the independence hypothesis is
rejected at significance level $\alpha$ if

$$ R > u_{1-\alpha} \quad \text{(one-sided test)}, \qquad |R| > u_{1-\alpha/2} \quad \text{(two-sided test)}, \tag{7} $$

where $u_{1-\alpha}$ is the $(1-\alpha)$-th quantile for the random
variable $u$, which is normally distributed with mean 0 and
variance $1/(n-1)$.
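The decision rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's Fortran program: the function names are hypothetical, the Spearman coefficient is taken in its usual form $R = 1 - 6 \sum_k (A_k - B_k)^2 / (n(n^2 - 1))$, ties are broken by order of appearance (the text does not say how ties are treated), and the quantile is taken from a normal distribution with mean 0 and variance $1/(n-1)$.

```python
import math
from statistics import NormalDist


def ranks_decreasing(values):
    """Position of each value in order of decreasing magnitude (1 = largest).

    Assumption: ties are broken by order of appearance.
    """
    order = sorted(range(len(values)), key=lambda k: -values[k])
    ranks = [0] * len(values)
    for position, k in enumerate(order, start=1):
        ranks[k] = position
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation coefficient of the paired samples x, y."""
    n = len(x)
    a, b = ranks_decreasing(x), ranks_decreasing(y)
    squared = sum((ak - bk) ** 2 for ak, bk in zip(a, b))
    return 1.0 - 6.0 * squared / (n * (n * n - 1))


def reject_independence(x, y, alpha=0.05, two_sided=True):
    """Reject independence when R exceeds the normal quantile.

    The reference distribution is N(0, 1/(n-1)), the asymptotic law of R.
    """
    n = len(x)
    r = spearman_r(x, y)
    reference = NormalDist(0.0, 1.0 / math.sqrt(n - 1))
    if two_sided:
        return abs(r) > reference.inv_cdf(1.0 - alpha / 2.0)
    return r > reference.inv_cdf(1.0 - alpha)
```

For example, `reject_independence(list(range(10)), list(range(10)))` rejects independence, since identical orderings give $R = 1$.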



In this approach the only restriction imposed on the
noise process is the statistical independence of the random
variables $\delta$ and $\gamma$ (or a very low known correlation
that allows $\alpha$ to be specified nonarbitrarily). Clearly this
restriction is equivalent to the requirement of mutually
independent readings. Because of its simplicity, such a model is
used quite widely. The real question is: is it realistic?

2. Algorithm Using One-Sided Sign Test 

Let us consider another approach to the search problem. 

In the absence of a sufficient (or even necessary) statistic,
no a priori statement can be made about the signal characteristics
required for the search. Such statistics as sample means and
the standard deviation will not be used to test one hypothesis
or another about the distribution, since:

a) the search proper assumes that the distribution of the
signal is unknown;

b) the statistical characteristics of the noise do not
remain constant: the process is nonstationary;

c) although the distribution of the random variable $F_i$
(concretely, the number of particles which arrived in the detector
during the time $\Delta t$) asymptotically approaches the normal
distribution (Central Limit Theorem) when $\Delta t$ is increased
(see [2]), which justifies one hypothesis, signals whose duration
is less than $\Delta t$ are no longer detected.

One signal characteristic which will be included in the
search is its duration, which, among other things, depends on
the conditions in which the experiment is carried out.

Thus let

$$ \bar{F}_i = \frac{1}{n} \sum_{k=1}^{n} F_{i+k}, \tag{8} $$

and let $d_{i,m}$ denote the quantity defined by expression (9)
(suppose $T_{st}$ denotes the time during which the stochastic
process $F$ can be regarded as stationary. Then a natural bound
on $n$ is

$$ n \le n_0 = \left[ \frac{T_{st}}{\Delta t} \right], \tag{10} $$

where $[x]$ is the integer part of the number $x$).


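Next,

$$ \nu_{i,m} = \begin{cases} 1, & d_{i,m} > 0, \\ 0, & d_{i,m} < 0, \end{cases} \tag{11} $$

$$ \mu^N = \sum_{m=1}^{N} \nu_{i,m}, \tag{12} $$

where $N$ is the assumed duration of the signal and $n$ is the size
of the sample over which the sample means are formed; $n$ must
satisfy

$$ N < n < n_0. \tag{13} $$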

It can easily be verified that the assumptions made about the
noise in regard to:

a) the symmetry of the distribution of the random variable,

b) the independence of the random variables (readings) $F_i$,

c) the fact that

$$ d_{i,m} \ne 0, $$

imply

$$ P\{d_{i,m} > 0\} = P\{d_{i,m} < 0\} = \frac{1}{2}, \tag{14} $$

$$ P\{\mu \ge \mu_{obs}\} = \sum_{k=\mu_{obs}}^{N} \binom{N}{k} \left( \frac{1}{2} \right)^{N} = P(\mu). \tag{16} $$


Items a), b), c) constitute the tested hypothesis $H_0$.

Let $\mu_\alpha$ be the smallest value of $\mu$ for which

$$ P(\mu) \le \alpha. \tag{17} $$

Then the one-sided test (the so-called sign test) requires
rejection of the hypothesis $H_0$ at significance level $\alpha$ if

$$ \mu \ge \mu_\alpha. \tag{18} $$

Thus, the algorithm based on this test presupposes that:

1) the noise process satisfies a), b), c);

2) any significant deviation from noise (see formula (17))
is the signal.
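The one-sided sign test can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's program: the function names are hypothetical, the input is any list of signed differences (how these differences are formed from the readings is left to the caller), zero differences are discarded (an assumption consistent with item c)), and the count of positive signs is taken to be binomial with probability 1/2 under the tested hypothesis.

```python
from math import comb


def upper_tail(mu, n_trials):
    """P{M >= mu} for M ~ Binomial(n_trials, 1/2)."""
    favourable = sum(comb(n_trials, k) for k in range(mu, n_trials + 1))
    return favourable / 2 ** n_trials


def critical_value(n_trials, alpha):
    """Smallest mu whose upper tail probability does not exceed alpha.

    Returns n_trials + 1 when no count is significant at this level.
    """
    for mu in range(n_trials + 1):
        if upper_tail(mu, n_trials) <= alpha:
            return mu
    return n_trials + 1


def sign_test_rejects(differences, alpha):
    """One-sided sign test on a list of signed differences."""
    nonzero = [d for d in differences if d != 0]  # assumption: zeros dropped
    mu = sum(1 for d in nonzero if d > 0)         # number of positive signs
    return mu >= critical_value(len(nonzero), alpha)
```

For 20 differences and a significance level of 0.05, `critical_value(20, 0.05)` returns 15: at least 15 positive signs are needed before the noise hypothesis is rejected.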

3. Use of Simplified Statistics


We designed and debugged programs written in Fortran IV
which are used to test item 1) for various noise processes
(using data recorded on telemetry tapes).


In the given algorithm, statistic (12) requires a large
number of time-consuming calculations. From this standpoint,
the algorithm can be substantially simplified if one of the
following random variables is introduced instead of $d$ (see
(9)) and $\nu$ (see (11)): either

$$ \delta_{ij} = \begin{cases} 1, & F_i \le F_j, \\ 0, & F_i > F_j, \end{cases} \tag{19} $$

where the indices $i$ and $j$ are restricted by a condition on
$|i - j|$, or

$$ \delta_i = \begin{cases} 1, & F_i \le F_{i+1}, \\ 0, & F_i > F_{i+1}. \end{cases} \tag{20} $$

The corresponding statistics are

$$ \mu_k^N = \sum_{i=k}^{k+N} \sum_{j} \delta_{ij} \tag{21} $$

or

$$ \mu_k^N = \sum_{i=k}^{k+N} \delta_i, \tag{22} $$

where $N$, as before, is the assumed duration of the signal and
$k$ is the current value of the subscript, satisfying the inequality

$$ k + N \le n_0. $$

It is clear that the test formulated above for testing
the $H_0$ hypothesis can be applied to each of these statistics.

A program written in Fortran IV was developed and debugged
for calculating statistics (21), (22).
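The saving in computation can be illustrated with a short Python sketch of the simpler of the two statistics. It assumes that the indicator equals 1 when a reading does not exceed the next one and that the statistic is the number of such indicators in a window of consecutive readings; the function names, the window convention (exactly `window` terms per sum) and the sample values are hypothetical and do not reproduce the author's Fortran IV program.

```python
def step_indicators(readings):
    """1 where a reading does not exceed its successor, 0 otherwise."""
    return [1 if a <= b else 0 for a, b in zip(readings, readings[1:])]


def windowed_counts(indicators, window):
    """Sum of `window` consecutive indicators for every start position.

    Each new value is obtained from the previous one by adding the
    indicator entering the window and subtracting the one leaving it,
    so no sample means have to be recomputed.
    """
    if window <= 0 or window > len(indicators):
        return []
    total = sum(indicators[:window])
    counts = [total]
    for k in range(window, len(indicators)):
        total += indicators[k] - indicators[k - window]
        counts.append(total)
    return counts


# Hypothetical readings.
readings = [1, 2, 3, 2, 1, 2]
indicators = step_indicators(readings)
counts = windowed_counts(indicators, window=2)
```

Each count can then be compared with the critical value of the sign test for the chosen window length.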

4. Results of Computerized Processing of Various Experimental Data

Program No. 1 calculates the statistic $R\sqrt{n-1}$ (see (6)).
The results of the calculations are illustrated in Fig. 1. The
array R corresponds to this statistic in the program. The
calculations were carried out for the case $n = 30$.

The noise process and the process containing the signal were
read into the program in the form of samples, each consisting
of 50 readings. Altogether 20 values were obtained from the
array R ($R(k)$, $k = 1, \ldots, 20$) for each case, i.e. 20
samples, each consisting of 30 readings formed from the 50
readings that were read in.
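One way in which 20 samples of 30 readings can be formed from 50 readings is by shifting a window one reading at a time. The Python sketch below assumes exactly this scheme; the text does not state how the samples were offset, and the function name and the placeholder data are hypothetical.

```python
def sliding_samples(readings, size, count):
    """Return `count` overlapping samples of `size` consecutive readings.

    Assumption: the k-th sample starts at the k-th reading (unit shift).
    """
    if count + size - 1 > len(readings):
        raise ValueError("not enough readings for the requested samples")
    return [readings[k:k + size] for k in range(count)]


# Placeholder data standing in for 50 recorded readings.
readings = list(range(50))
samples = sliding_samples(readings, size=30, count=20)
```

Each of the resulting samples would then yield one value of the statistic.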

The given significance level $\alpha$ can be specified as follows:

d. ; c ^)l )• 

*fki2C 

From Fig. 1 it is evident that in the case of the signal, $R$ in
absolute value is greater than the $R$ value for noise nearly
everywhere.

Page 13 gives program No. 2, which calculates the statistics
$\mu^N$ (corresponding to array MI) and $P(\mu)$ (corresponding
to array PR). The statistic $\mu^N$ was calculated for $N = 20$.

The arrays that were read in (noise and signal) had the
same dimension as in program No. 1, i.e. 50. Fig. 2 illustrates
(besides the obtained curves) the operation of the algorithm:
$\alpha$ is given (horizontal line) and $\mu_\alpha$ is obtained
(see (17)). It is evident that if the significance level $\alpha$
is specified in this manner, nearly the entire signal lies in the
region $\mu > \mu_\alpha$, which signifies the presence of the
latter.

Conclusion

The fourth section presents the results of calculations
of the statistics $R\sqrt{n-1}$ (corresponding to array R in
program No. 1), $\mu^N$ (corresponding to array MI in program
No. 2) and of the value of $P$ given by expression (16)
(corresponding to array PR in program No. 2) for noise processes
and processes containing the signal (data from telemetry tapes of
the "Pllln" experiment).

In [7] this signal was isolated in a different way using the
correlation method.

The preliminary operations involving centering and
summing centered readings over four channels used up more machine
time than processing of the same array in this study. For a
given significance level, more accurate data on utilized machine
time can be used to compare the efficiencies of the discussed
algorithms with each other and with the algorithm in reference
[7].


In conclusion, the author expresses his deep gratitude to
the pioneer of this study, V.I. Berezin, and also to Ye.N.
Kiseleva for her great contribution in writing and debugging
programs, processing data and formatting results.